Description

METHOD FOR IDENTIFYING STRUCTURALLY ACTIVE COMPOUNDS
USING CONFORMATIONAL MEMORIES

Introduction

The present invention relates to a method for predicting the
conformation and functionality of a molecule comprising the steps
of, first, performing multiple annealing runs in order to reveal
populated and unpopulated regions of multidimensional conformation
space, and, second, performing a simulation at a fixed temperature,
with sampling only from populated regions found in the first step.

Background of the Invention

The insights gained from simulations, and the growing prevalence of
relatively inexpensive computer power, have led to the widespread
use of many computational techniques. The success of these methods
has prompted the continuing development of new methods to study
more and more complex problems. Early chemical simulations, for
example, were used to estimate equilibrium statistical mechanical
quantities and transport properties of collections of point
particles whose interactions were governed by simple potentials
(Alder and Wainwright, Phys. Rev. Lett., 18:988 (1967); Hoover and
Ree, J. Chem. Phys. 12:3609 (1968); Rahman, Phys. Rev. 126:A405
(1964); Verlet, Phys. Rev. 159:98 (1967)). Today, simulations are
used to compute free energies (Jorgensen, Acc. Chem. Res. 22:184
(1989); Kollman, Chem. Rev. 22:2395 (1993)) and to study complex
systems like protein structure (McCammon and Harvey, in Dynamics of
Proteins and Nucleic Acids, Cambridge University Press, New York,
N.Y. (1987)). Increasing demands naturally lead to increasing
difficulties. One of the great difficulties in computational
chemistry is the simulation of flexible organic and biological
molecules. These systems are problematic because they belong to the
general class of problems known as multiple time scale problems
(Brackbill and Cohen, in Multiple Time Scales, Academic Press,
Orlando, Florida (1985)). Flexible organic molecules belong to this
class because bond length and bond angle motion occurs on a
femtosecond to picosecond time scale, while torsional motion may
occur on a nanosecond time scale or longer. Since true convergence
of statistical mechanical properties requires multiple
interconversions between all torsional states, obtaining stable
averages may require simulations on the order of tens or even
hundreds of nanoseconds.

Typically, chemical systems are simulated using Monte Carlo ("MC";
Metropolis et al., J. Chem. Phys. 21:187 (1953)) or molecular
dynamics ("MD"; Allen and Tildesley, in Computer Simulations of
Liquids, Clarendon, Oxford (1987)). The recognition that the study
of complex systems may overtax these standard methods has led to
much work on the development of more powerful techniques. Several
different variations and extensions of these two basic procedures
have been tried in an attempt to find more efficient methods. These
new algorithms generally fall into two broad classes: MD methods
with the addition of some random character (Anderson, J. Chem.
Phys. 72:2384 (1980); van Gunsteren and Berendsen, Mol. Sim., 1:173
(1988)) and MC methods which utilize some partial deterministic
character to generate better trial moves (Brass et al., Biopol.,
22:1307 (1993); Heerman et al., Comp. Phys. Comm., 60:311 (1990);
Cao and Berne, J. Chem. Phys., 22:1980 (1990); Rao and Berne, J.
Chem. Phys., 21:129 (1979); Rossky et al., J. Chem. Phys., 69:4628
(1978)). The most recent algorithm, the mixed MC-SD algorithm, is a
pure hybrid that uses MD and MC methods equally (Burger et al., J.
Amer. Chem. Soc., 116:8, 3593).

In the MC method, increases in efficiency may be obtained by
generating nonuniform random trial moves. If this is done over the
same surface being studied, it is known as biased sampling. If the
searching is carried out over a potential surface that is different
from the surface being studied, it is characterized as importance
sampling. Since importance sampling is done over a different
surface than the one actually being studied, it is necessary to
appropriately weight the statistical mechanical averages (Kalos and
Whitlock, Monte Carlo Methods Vol. 1, Wiley, New York, N.Y.
(1986)). In biased sampling, since the searching is being done over
the same potential surface as the actual surface under study, no
weighting of the averages is necessary. In biased sampling, MC
trial moves are based on some a priori knowledge of the space to be
sampled. Regions that are more "important" are sampled with greater
frequency in the full expectation that the speed of convergence
will be enhanced. In practice, a simulation employing biased
sampling is done in two steps: some initial procedure is employed
to reveal (or guess) the important gross features, then an
extensive search utilizing this information is performed.
Obviously, this procedure will be valid and useful only if it is
faster than a standard sampling procedure, and if it introduces no
spurious artifacts.

Summary of the Invention

The present invention relates to a method for identifying
structurally active molecules comprising the steps of, first,
performing multiple simulated annealing runs in order to reveal
populated and unpopulated regions of multidimensional conformation
space, and, second, performing a simulation at a fixed temperature,
with sampling only from populated regions found in the first step.
It is based, at least in part, on the discovery that the method of
the invention could be used to sort a large family of analogs of
gonadotropin-releasing hormone ("GnRH analogs") into groups having
low or high affinity for the GnRH receptor. The method of the
present invention offers the advantage that, since the simulated
annealing runs quickly reveal unpopulated regions of the
conformation space, the volume of conformation space that needs to
be sampled in the second phase of the algorithm is reduced by many
orders of magnitude. Additionally, since no energy minimization is
used, these populations represent a canonical ensemble which may be
used to estimate conformational free energies.

Description of the Figures

Figure 1. Structure of LTB4.

Figure 2. A Flex-Map or "Conformational Memory" of dihedral 1 from
LTB4.

Figure 3. Graphical representation of the mapping of random numbers
onto the dihedral distribution. Random numbers between 0 and 1
determine dihedral values (using the -~- line). For example, 0.6
maps to +65°.

Figure 4. Plot of the average LTB4 conformer energy vs temperature
for ten normal SA runs and ten Smart-SA runs. Very rapid energy
lowering is possible using Smart-SA, although the ultimate energy
is similar. Each 10 run set required ~16 hrs of CPU time on a VAX
8600 and represents 100,000 conformers.

Figure 5. Molecular structure of the gonadotropin-releasing hormone
(GnRH). The 35 rotatable torsional angles are indicated by arrows.

Figure 6. Conformational Memories of selected dihedral angles in
the gonadotropin-releasing hormone (see Fig. 5 for identity of the
angles). a. Dihedral angle 4; b. Dihedral angle 6; c. Dihedral
angle 7; d. Dihedral angle 35.

Figure 7. Conformational Memory difference maps of dihedral angle
20 in GnRH (see Fig. 5). The difference maps were created by
subtracting the Conformational Memories from A. 25 and 10 runs; B.
50 and 25 runs; C. 75 and 50 runs; D. 100 and 75 runs. Note that in
almost all regions differences are less than 1%. Conformational
Memory difference maps of the other dihedrals are very similar.

Figure 8. A sequence of 310K temperature slices from the
Conformational Memory of bond 17, calculated with a. 25 runs; b. 50
runs; c. 75 runs; d. 100 runs. Note the symmetrically equivalent
population distributions centered about -90 and 90.

Figure 9. The choice of dihedral angle values in the biased
sampling of the populated region of the Conformational Memories.
The illustration is for dihedral angle 19. Panel (a) shows a
histogram representation of the probability distribution for the
dihedral angle; panel (b) shows the cumulative probability
distribution for dihedral angle 19. Since the random number
generator is a cumulative probability distribution, biased sampling
is done from the histogram in part (b). If the random number 0.2 is
generated, which corresponds to the second block of the histogram
in part (b), the new trial dihedral will be chosen from the
interval -170 to -160 degrees with the actual value obtained from a
linear interpolation within this interval. If the random number 0.4
is generated, which corresponds to the 28th block of the histogram
in part (b), the new trial dihedral will be chosen from the
interval 90 to 100 degrees. Note that the region between -60 and
60, which has no population in part (a), is automatically skipped
when sampling from part (b).

Figure 10. Backbone trace of a representative of the five
conformational families of GnRH obtained from Conformational
Memories. Structures with a beta-type turn have a 70% population.
Structures with a straight backbone have approximately 5%
population.

Figure 11. Superimposition of 70 structures that make up the major
conformational family of GnRH obtained from Conformational
Memories. While there is a large amount of fluctuation in the
backbone, and an even greater amount of fluctuation in the side
chains (especially Arg8), there is a clear beta-type turn from
residues 5-8 in this family.

Figure 12. Backbone trace of a representative of the two
conformational families of Lys8-GnRH obtained from Conformational
Memories. The structure with the beta-type turn comes from a family
with an approximately 3% population. The structure with the
straight backbone comes from a family with an approximately 70%
population.

Figure 13. Superimposition of 70 structures that make up the major
conformational family of Lys8-GnRH obtained from Conformational
Memories. While there is a large amount of fluctuation in the
backbone, and an even greater amount of fluctuation in the side
chains (especially Lys8), the backbone is clearly extended.

Figure 14. Superimposition of a high affinity GnRH cyclic analog
(Structure I) and a representative of the major GnRH conformational
family (Structure II). Eleven backbone atoms from residues 5-8 were
used for the superimposition.

Detailed Description of the Invention

For purposes of clarity of description, and not by way of
limitation, the present invention is described by way of two
examples. First, the method of the invention is applied to
determining the structure of leukotrienes. Second, the method of
the invention is used to identify GnRH analogs which have a high
affinity of binding to the GnRH receptor.

Example 1: Leukotriene Structure

Leukotrienes, for example, are an important class of natural
antiinflammatory agents (Sammuelsson et al., Prostaglandins, 17:785
(1979); "The Leukotrienes: Their Biological Significance," P.J.
Piper, Ed., Raven Press, N.Y. (1986)). Understanding the bioactive
conformations of a key member of this class such as LTB4 (Figure 1)
involves conformational analysis of 14 flexible dihedrals.

This is, however, an extremely difficult problem. For a description
of "impossible" computational problems see: W. Garey, Computers and
Intractability (H. Freeman and Co., New York, N.Y. 1979). Even
considering only a three state model around each bond (anti and +/-
gauche) there are 3^14 (4,782,969) possible conformations
(Kirkpatrick et al., Science, 220:671 (1983); Simulated Annealing
and Optimization, M.W. Johnson, Ed., American Sciences Press,
Syracuse, N.Y. (1988)). A recent case study on the conformational
analysis of cycloheptadecane (Saunders et al., J. Amer. Chem. Soc.
112:1419 (1990)), which the authors state is equivalent to a
12-dimensional nonsymmetric problem, nicely illustrates the
difficulties of searching a multidimensional conformational space.
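
The growth of the three state count with the number of rotatable
bonds can be checked with a short calculation. The sketch below is
illustrative only; the function name is an assumption and is not
part of the method described here.

```python
def three_state_count(n_bonds: int) -> int:
    """Number of conformations when every rotatable bond has three states
    (anti, +gauche, -gauche)."""
    return 3 ** n_bonds


# Tabulate the count for 12, 13 and 14 rotatable bonds.
for n in (12, 13, 14):
    print(n, three_state_count(n))
```

Each additional rotatable bond triples the number of conformations,
which is why exhaustive enumeration becomes impractical so quickly.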

Smart-Simulated Annealing: The Learning Phase

To address this problem we have developed a conformational analysis
technique which combines simulated annealing (SA) (Kirkpatrick et
al., Science, 220:671 (1983); Simulated Annealing and Optimization,
M.W. Johnson, Ed., American Sciences Press, Syracuse, N.Y. (1988);
Wilson et al., Tetrahedron Letters, 4343 (1988); Wilson et al.,
Proceedings of the Seventh Workshop on Vitamin D, Walter de Gruyter
Co. (1988); Wilson and Cui, Biopolymers, 29:225 (1990); Wilson et
al., J. Comp. Chem., 12, 3, 342 (1991)) and biased sampling (Kalos
and Whitlock, Monte Carlo Methods Vol. 1, Wiley, New York, N.Y.,
1986) into a type of learning algorithm (Caudill, Expert, 12/89,
4/90, 6/90; Judd, in Neural Network Design and the Complexity of
Learning, MIT Press, Cambridge, MA (1990); Statistical Mechanics of
Neural Networks, Luis Garrido, Ed., Springer-Verlag, New York, N.Y.
(1990); Lacey, Ed., Neural Networks, Tetrahedron Computer
Methodology, 1990, 3). The method is a 2-stage process made up of a
learning phase and an implementation phase. The learning phase
starts by randomly sampling the dihedral space of all flexible
bonds using the simulated annealing algorithm (Kirkpatrick et al.,
Science, 220:671 (1983); Simulated Annealing and Optimization, M.W.
Johnson, Ed., American Sciences Press, Syracuse, N.Y. (1988)). The
entire 360° continuous dihedral space of all flexible torsional
angles is sampled in accordance with the fundamental hypothesis of
equal a priori probabilities (Tolman, The Principles of Statistical
Mechanics, Dover Press, New York (1971)). To provide our
knowledge-base, multiple SA runs are performed and for each step,
the chosen dihedral, value of the chosen dihedral and conformation
energy at that step are recorded (F. Guarnieri, Ph.D. Thesis, New
York University (1992)). This series of log files is converted into
population distributions by summing and/or averaging the number of
hits in ten degree intervals. The conformation space of the
antiinflammatory agent LTB4, which has fourteen rotatable dihedral
angles, gives us the Flex-Map (Wilson and Guarnieri, Tetrahedron
Lett., 32:3601 (1991)) plots of these fourteen bonds. One typical
Flex-Map is shown in Figure 2. These plots contain information on
the overall population distribution for each rotatable bond as a
function of temperature. Since the entire molecule is in flux with
all energetic interactions taken into account at every step, the
Flex-Maps are mean field population distributions with no
approximations. Hence, they are a true canonical ensemble with
respect to all flexible torsional states of the molecule. These
maps rapidly reveal occupied regions of dihedral space and "dead
zones" which are totally devoid of conformations at any
temperature. These "dead zones" are the key to why it is so
difficult to search the conformation space of flexible molecules
with many rotatable torsionals. Most methods sample from the whole
space throughout a conformation search. Clearly these dihedral
distributions, which we now call conformational memories, indicate
that sampling from many regions is a complete waste of time.
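
The conversion of log entries into population distributions over
ten degree intervals can be sketched as follows. This is a minimal
illustration under stated assumptions, not the original program:
the record layout (temperature, bond identifier, dihedral value)
and all function names are hypothetical.

```python
from collections import defaultdict

BIN_WIDTH = 10                 # degrees per interval, as described in the text
N_BINS = 360 // BIN_WIDTH      # 36 intervals covering -180 to +180


def bin_index(angle_deg: float) -> int:
    """Map a dihedral angle in degrees to its ten degree interval (0..35)."""
    wrapped = (angle_deg + 180.0) % 360.0   # shift into the range 0..360
    return int(wrapped // BIN_WIDTH)


def build_memory(records):
    """Turn (temperature, bond_id, dihedral_deg) records (assumed layout)
    into normalized populations keyed by (temperature, bond_id)."""
    counts = defaultdict(lambda: [0] * N_BINS)
    for temperature, bond_id, angle in records:
        counts[(temperature, bond_id)][bin_index(angle)] += 1
    memory = {}
    for key, hits in counts.items():
        total = sum(hits)
        memory[key] = [h / total for h in hits]
    return memory


def dead_zones(populations):
    """Return the (low, high) angle ranges of intervals with no population."""
    return [(-180 + i * BIN_WIDTH, -180 + (i + 1) * BIN_WIDTH)
            for i, p in enumerate(populations) if p == 0.0]


# Four made-up log entries for one bond at one temperature.
records = [(200, 1, 65.0), (200, 1, 62.0), (200, 1, -175.0), (200, 1, 178.0)]
memory = build_memory(records)
print(memory[(200, 1)][bin_index(65.0)])   # population of the 60 to 70 degree interval
print(len(dead_zones(memory[(200, 1)])))   # number of empty intervals
```

With real data, an interval should only be treated as a dead zone
after enough independent runs have been compiled, as discussed in
the text that follows.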

It is self-evident that it will be vastly more efficient to sample
from the smaller space obtained from the elimination of "dead
zones" compared to sampling the original space. The key point is to
make sure that thermally accessible regions are not erroneously
labeled as "dead zones" because the second phase of the simulation
would be flawed. Thus, in the first part of the simulation, care
must be taken to ensure a good sampling. This is why repeated
simulated annealing runs are performed in the initial phase.
Multiple runs from different random starting geometries using high
temperatures and large Monte Carlo steps have the best chance of
sampling in every region. We would like to point out that a
comparable simulated annealing strategy was shown to be capable of
searching the entire conformation space of cycloheptadecane (Wilson
and Guarnieri, Tetrahedron Lett. 32:3601 (1991); Guarnieri and
Wilson, Tetrahedron, 48:4271 (1992); Guarnieri et al., J. Chem.
Soc., Chem. Comm., 11:1542 (1991)) in less than 48 hrs on a
Microvax. In contrast, the aforementioned effort (Saunders et al.,
J. Amer. Chem. Soc., 112:1419 (1990)) using most known search
methods took about 2 CPU years on a Microvax. For more complicated
systems such as LTB4, we are confident that compilations of
repeated runs started from different configurations, using
different random number seeds, and initialized with a thermal
energy of over 1000K, reveal the populated and unpopulated regions
of the 14 dimensional torsional space of LTB4. In fact, the
unpopulated regions are revealed particularly early on in the
simulation. In compiling 5, 10, 15 and 20 runs, the ratios of the
populated regions change (to a very small degree in going from 15
to 20 runs), but there is virtually no change in unpopulated
regions throughout this progression. For LTB4 the unpopulated
regions make up more than half of the total conformational space at
200K. A sampling strategy which avoids these "dead zones" would
reduce the volume of conformation space that needs to be searched
from 360^14 to less than 180^14, where 14 is the dimensionality of
the space, and 360 is the extent of one dimension.
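
The size of this reduction follows directly from the two volumes
quoted above. The sketch below simply evaluates the ratio, taking
180 degrees per dimension as the upper bound given in the text; the
variable names are illustrative assumptions.

```python
DIMENSIONS = 14          # rotatable dihedrals in LTB4
FULL_EXTENT = 360        # degrees available to each dihedral
POPULATED_EXTENT = 180   # upper bound once the dead zones are excluded

full_volume = FULL_EXTENT ** DIMENSIONS
reduced_volume = POPULATED_EXTENT ** DIMENSIONS

# Integer division is exact here because 360 = 2 * 180.
reduction_factor = full_volume // reduced_volume
print(reduction_factor)   # 16384, i.e. 2**14
```

Because the populated extent is stated only as an upper bound, the
factor of 2^14 is a lower bound on the reduction; a smaller
populated fraction per bond gives a correspondingly larger factor.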

Smart-Simulated Annealing: The Implementation Phase

The implementation phase involves utilizing the information
contained in the 14 conformational memories, an example of which is
shown in figure 2. This is done by again running the SA Metropolis
algorithm, but instead of selecting new trial conformations at
random over the whole dihedral circle, we select new trial
conformations by sampling only from populated regions of the
conformational memory for each bond at a given temperature. To
search for low energy conformations, the populations at 200K were
chosen. We call this technique of biased sampling with simulated
annealing Smart-SA.

A new procedure is needed to sample a dihedral space embedded with
"dead zones." In order to carry out this biased sampling, it is
necessary to map the uniformly distributed random numbers produced
by standard random number generators onto the conformational
memories. This process is illustrated in figure 3. The
conformational memory in figure 2 may be approximated by a three
state model. Figure 3 shows mapping of random numbers into this
distribution. To perform this mapping, our algorithm requires
information on the interval and population of each state (the
number of states is actually four instead of three because dihedral
space goes from -180 to +180). The shaded regions are the "dead
zones" and thus are never sampled. Since the first region has a 1/3
probability of being surveyed, if the generated random number is
between zero and 1/3, the new dihedral is selected from this first
region. The exact value that this bond will be set to is obtained
by starting at the point on the ordinate at the value of the random
number, moving horizontally until the plotted line is reached,
dropping a vertical from that point, and taking the value that
arises from the intersection of the vertical with the abscissa. For
example, a random number of 0.6 maps to a dihedral value of 65°.
This angle is passed to the dihedral driver to construct the new
conformation. Once this conformation is created, the algorithm
passes back to our standard simulated annealing routines
(Kirkpatrick et al., Science, 220:671 (1983); Simulated Annealing
and Optimization, M.W. Johnson, Ed., American Sciences Press,
Syracuse, N.Y. (1988); Wilson et al., Tetrahedron Letters, 4343
(1988); Wilson et al., Proceedings of the Seventh Workshop on
Vitamin D, Walter de Gruyter Co. (1988); Wilson and Cui,
Biopolymers, 29:225 (1990); Wilson et al., J. Comp. Chem. 12, 3,
342 (1991)).
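
The mapping of a uniform random number onto the populated regions
amounts to inverting a cumulative distribution built from those
regions only. The following is a minimal sketch under stated
assumptions: the four regions and their populations are made-up
values for illustration, and the function names are hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def make_sampler(regions):
    """regions: list of (low_deg, high_deg, population) for populated
    regions only. Dead zones are left out, so they can never be returned."""
    total = sum(p for _, _, p in regions)
    cumulative = list(accumulate(p / total for _, _, p in regions))
    cumulative[-1] = 1.0   # guard against floating point round-off

    def sample(u: float) -> float:
        """Map a uniform random number u in [0, 1) to a dihedral angle."""
        i = min(bisect_right(cumulative, u), len(regions) - 1)
        lower = cumulative[i - 1] if i > 0 else 0.0
        low, high, _ = regions[i]
        # Linear interpolation within the selected region.
        frac = (u - lower) / (cumulative[i] - lower)
        return low + frac * (high - low)

    return sample


# Hypothetical three state bond; the anti state is split at +/-180,
# giving four regions in total.
regions = [(-180.0, -160.0, 1.0), (-80.0, -40.0, 2.0),
           (40.0, 80.0, 2.0), (160.0, 180.0, 1.0)]
sampler = make_sampler(regions)

rng = random.Random(42)
trial_angles = [sampler(rng.random()) for _ in range(5)]
print(trial_angles)
```

Every returned angle falls inside one of the listed regions, so no
trial move is spent on a dead zone. Note that this sketch samples
in proportion to the stored populations; sampling equally from all
populated regions would use equal weights instead.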

Results

The LTB4 problem above was run using the same SA control data as
previously reported (Kirkpatrick et al., Science, 220:671 (1983);
Simulated Annealing and Optimization, M.W. Johnson, Ed., American
Sciences Press, Syracuse, N.Y. (1988)). Ten runs of SA and Smart-SA
were carried out on LTB4 (500 steps at 25 temperatures). The
convergence results are shown in figure 4. Much faster lowering of
the conformer energy occurs with Smart-SA, producing rapid
convergence.

On a system as complicated as LTB4 it is impossible to prove that a
conformation search is complete. One usual measure of
comprehensiveness is to perform repeated searches using different
initial conditions until several searches produce the same results.
Ten ordinary simulated annealing runs on LTB4 produced ten
different low energy conformations. Ten runs of Smart-SA with the
conformational memories taken at 200K produced two conformations (6
of one and 4 of the other) which were both lower in energy than any
of the ten conformations found by ordinary simulated annealing.

Computational Details

Since many of the computational details have been reported (Wilson
et al., Tetrahedron Letters, 4343 (1988); Wilson et al.,
Proceedings of the Seventh Workshop on Vitamin D, Walter de Gruyter
Co. (1988); Wilson and Cui, Biopolymers, 29:225 (1990); Wilson et
al., J. Comp. Chem. 12, 3, 342 (1991); F. Guarnieri, Ph.D. thesis,
New York University, 1992; Wilson and Guarnieri, Tetrahedron Lett.
32:3601 (1991)) only a brief summary will be outlined. All runs
were started with beta=0.11. This corresponds to a temperature of
1093K given that beta=1/(RT) where R=8.314e-3 kJ/(mol K). After
every block of steps beta is multiplied by 1.1 to reduce the
temperature. In theory, the cooling schedule should be controlled
and varied as a function of the heat capacity. In practice, we have
found this fixed schedule to be a reasonable compromise between the
need for slow cooling and acceptable CPU performance. At each step
the dihedral angles that are rotated to create the trial
configuration are noted. Whether the trial configuration is
accepted or rejected, the rotated dihedrals, the extent of the
rotation, the value of the energy of the trial configuration and
the new dihedral values if the trial conformation is accepted or
the old dihedral values if the trial conformation is rejected are
recorded to the log file in temperature blocks. One log file is
created for each run. A utility program inputs all of this raw data
and combines it according to temperature blocks. This data is
output in comma delimited format so that it can be imported into
Deltagraph (Deltagraph TM version 1.0, Copyright Deltapoint, Inc.,
200 Heritage Harbor, Suite G, Monterey, CA 93940) which is used to
plot the conformational memories. Using another utility program or
manually, a temperature slice from the conformational memory is
extracted. In the second phase of the simulation the sampling is
done from a subroutine that performs the calculation shown in
figure 3 instead of just using the standard random number
generator.
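
The relation between beta and temperature, and the effect of
multiplying beta by 1.1 after each block, can be checked
numerically. The sketch below assumes R = 8.314e-3 kJ/(mol K) and
uses the 25 temperature blocks mentioned in the Results; the
function names are illustrative assumptions.

```python
R_KJ_PER_MOL_K = 8.314e-3   # gas constant in kJ/(mol K)


def temperature_from_beta(beta: float) -> float:
    """Invert beta = 1/(R*T) to obtain the temperature in kelvin."""
    return 1.0 / (R_KJ_PER_MOL_K * beta)


def beta_schedule(beta0: float = 0.11, factor: float = 1.1, n_blocks: int = 25):
    """Beta used in each temperature block; beta grows by `factor` per block."""
    return [beta0 * factor ** k for k in range(n_blocks)]


betas = beta_schedule()
temperatures = [temperature_from_beta(b) for b in betas]

print(round(temperatures[0]))   # 1093, the starting temperature in kelvin
for block, t in enumerate(temperatures):
    print(block, round(t, 1))
```

Multiplying beta by a constant factor is equivalent to dividing the
temperature by that factor, so the schedule is a geometric cooling
ramp.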

At the outset of the study we were faced with the nearly
intractable 14-dimensional problem. The learning phase of the
simulation reveals that about 60% of the entire conformational
space is unpopulated "dead zones" at 200K. Going into the
implementation phase of the simulation, we were able to reduce the
volume of the conformation space that needed to be sampled by many
orders of magnitude. Additionally, since several dihedrals remained
exclusively in a trans conformation at all temperatures throughout
the many learning phase simulations, we were also able to reduce
the dimensionality of the search in the implementation phase by
setting these dihedrals to a constant value of 180 with no loss of
generality. To reiterate, Smart-SA (simulated annealing with biased
sampling) allows a considerable reduction in the conformational
space which needs to be sampled. Hence, much larger systems, which
have been generally considered computationally intractable, may now
be studied.

Having used this technique with much success in the area of
conformation searching, we have begun exploring Smart-SA
applications to quantitative problems such as free energy
simulations. Preliminary results with sampling proportional to the
height of the populated peaks have proven dependent upon the length
of the learning phase. Only after getting quantitative convergence
in the ratio of the populations are the final results invariant.
This problem can be traced to a violation of detailed balance.
Preliminary results of simulations sampling equally from all
populated regions (with no sampling from "dead zones"), which
should not violate detailed balance, have proven successful.

Example 2: Identification of Active GnRH Analogs

The key physiological role of the gonadotropin-releasing hormone
([pGlu1-His2-Trp3-Ser4-Tyr5-Gly6-Leu7-Arg8-Pro9-Gly10-NH2]; GnRH)
as a mediator of neuroendocrine regulation in the mammalian
reproductive system has made it the object of intense study for
several decades. The ability of GnRH and its analogs to modulate
the pituitary-gonadal axis has made them essential therapeutic
agents in the treatment of a variety of disorders ranging from
infertility to prostatic carcinoma (Casper, Can Med Assoc J.
144:153-160 (1991); Barbieri, Trends Endocrinol Metab 2:30-34).
Conformational studies have played a central role in the quest for
understanding the structural basis for the activities of these
peptides, as well as in attempts to design new analogs with
improved pharmacological properties. Investigations of the
biological mechanisms underlying the actions of GnRH are quite
difficult because small peptides are extremely flexible.
Spectroscopic techniques, for example, indicate that a multitude of
interconverting conformers exist simultaneously. In an attempt to
pare away some of the many thermally accessible but biologically
irrelevant conformations, several investigators have synthesized
restricted GnRH analogs (Rizo et al., J. Amer. Chem. Soc., 114:2852
(1992); Bienstock et al., J. Med. Chem., 36:3265 (1993)). This
approach has proven useful for defining some structural motifs of
antagonists of the hormone. Obtaining comprehensive detailed
molecular conformational properties, however, such as the specific
dihedral values of the bioactive forms, is a formidable task given
that GnRH has 35 rotatable bonds as shown in figure 5.

The inherent complexities of small flexible peptides have motivated
studies which combine computational and experimental techniques
(Young and Hicks, Biopoly., 34:611 (1994)). The computational
method of choice is molecular dynamics. While dynamical techniques
are capable of revealing short time scale molecular motions, these
methods are generally incapable of exploring the ensemble of
conformational states that exist in flexible molecules (Guarnieri
and Still, J. Comp. Chem., 15:1302 (1994)). To explore the whole
ensemble of conformational states that exist in GnRH, we have used
the recently developed technique of conformational memories (Wilson
and Guarnieri, Tetrahedron Lett. 32:3601 (1991)). Here we show that
application of this technique can yield converged dihedral
populations of all 35 rotatable bonds of the peptide GnRH with no
approximations. Samples from the conformational memory biased
sampling technique were used to characterize the conformational
families of GnRH and several of its analogs, in an aqueous
environment modeled with the generalized Born/surface area (GB/SA)
method (Still et al., J. Amer. Chem. Soc., 112:6127 (1990)). This
analysis reveals the conformational preferences of GnRH and its
analogs, and suggests some of the structural determinants for their
biological function.

Method of Conformational Memories

The simulation technique of conformational memories is a two stage
process consisting of an exploratory phase and a biased sampling
phase. In the exploratory phase repeated runs of Monte Carlo
simulated annealing (MC/SA) (Kirkpatrick et al., Science, 220:671
(1983)) are carried out in order to map out the entire
conformational space of the flexible molecule. The construction of
conformational memories described below has been interfaced with
the Macromodel (Mohamadi et al., J. Comput. Chem. 11:440 (1990))
molecular modeling package version 5.0 so that the continuous GB/SA
solvent model (Still et al., J. Amer. Chem. Soc. 112:6127-6129
(1990)), and the recently developed amino acid backbone torsional
potentials (McDonald et al., Tetra. Lett. 22:7743-7746 (1992)) from
the Macromodel package could be used in the present conformational
study. As applied here, the MC/SA protocol for the exploratory
phase was designed with a starting temperature of 2070K and a
cooling schedule of T(n+1) = 0.9*T(n) for nineteen discrete
temperature points. At each temperature 10,000 steps were applied
to the 35 rotatable bonds (fig. 5), cooling the system to a final
temperature of 310K. Trial conformations in the MC/SA routine were
generated by randomly picking 2 rotatable bonds from among the 35,
rotating each by a random value between -180 and 180 degrees, and
accepting or rejecting the trial conformation according to the
standard Metropolis (Metropolis et al., J. Chem. Phys., 21:187
(1953)) criteria with a Boltzmann probability function defined at
the given temperature. After each step, whether or not the trial
conformation was accepted, the rotated bonds, the extent of
rotation, the energy, and the value of the dihedral angles are
written to a "log file". An example of the output to the log file
is given in Table 1. In this example, the first group of entries,
corresponding to the first two lines, is the result of a rejected
step as indicated by the zeros in the first column. The second and
third columns identify the atom numbers of the bonds that were
rotated to create the trial move (in this example atom40-atom41 and
atom47-atom48). The fourth column lists the extent of rotation of
the torsion angle in degrees. The fifth column lists the total
energy of the trial conformation. The sixth column holds the
current dihedral value of the bond.

The second group of entries in Table 1, corresponding to the next
two lines, lists the results of a trial rotation that was accepted
as a new conformation (as indicated by the digit one in the first
column). The current dihedral values given in the last column are
the new values of the newly accepted conformation.

Each run of MC/SA consists of a random walk of
190,000 steps (19 temperatures, 10,000 steps per
temperature). Because two lines of data are added to
the log file for each Monte Carlo step, a single run
creates a file of 380,000 lines. To explore the
conformations of the GnRH peptide in GB/SA water, we
performed 157 of these simulations, creating log files
of the different random walks. A 157 run MC/SA simulation
requires about 12 days of computation on an SGI
Challenge 200 MHz workstation.
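The arithmetic of the annealing protocol described above can be checked with a short sketch. It assumes only the numbers stated in the text (starting temperature 2070K, factor 0.9, nineteen temperature points, 10,000 steps per temperature, two log lines per step); the function and variable names are illustrative and are not taken from the MC/SA program itself.

```python
def cooling_schedule(t_start=2070.0, factor=0.9, n_points=19):
    """Return the annealing temperatures T_{n+1} = factor * T_n."""
    temps = [t_start]
    for _ in range(n_points - 1):
        temps.append(temps[-1] * factor)
    return temps


temps = cooling_schedule()
steps_per_temperature = 10_000
lines_per_step = 2

# One run visits every temperature for the same number of steps.
total_steps = len(temps) * steps_per_temperature
# Two lines are written to the log file for each Monte Carlo step.
total_log_lines = total_steps * lines_per_step

print(round(temps[-1], 1), total_steps, total_log_lines)
```

Under these assumptions the last temperature comes out at about 310.7K, which matches the stated final temperature of 310K, and the step and line counts reproduce the 190,000 and 380,000 figures.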

To obtain structural information from this large
amount of data, the log files are used as input to a
program (called Flex) that sorts, merges, and compacts
the data in several ways. Since the simulations were
done at 19 temperatures for each peptide, application
of Flex first sorts and merges the data from all log
files into 19 temperature blocks. Subsequently, within
each temperature block, the data are partitioned into
35 bond blocks, one for each rotatable bond. For each
rotatable bond, the dihedral angle space is partitioned
into 36 ten degree intervals. From each line of data
for a given bond at a given temperature, the program
records the number of times that the bond dihedral
angle value belongs to one of the ten degree buckets,
i.e., a "Conformational Memory". Finally, the Flex
program produces a 19x36 spreadsheet (recording 19
temperatures by 36 10-degree dihedral intervals with
normalized populations) for each of the 35 rotatable bonds
of the GnRH peptide. An excerpt of one of these
spreadsheets is given in Table 2. The spreadsheets are
imported into Deltagraph (TM Version 1.0, Copyright
Deltapoint, Inc., 200 Heritage Harbor, Suite G,
Monterey, CA 93940 (1987)) for plotting, and graphical
representations of the data in the spreadsheets are
given in Figures 6 A-D. Across the top of the spreadsheet
are the dihedral angle values from -170° to 180°,
which label the y-axes of Figures 6 A-D (note that the
spreadsheet fragment is cut off at -100). In the first
column are the 19 temperatures, ranging from 2070 to
310, which label the x-axes in Figures 6 A-D. The value
in a spreadsheet position corresponding to a given
temperature and a given ten degree dihedral bucket is
the population percentage, which is plotted on the
z-axes of Figures 6 A-D.
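The bucketing step performed by Flex can be pictured with the following minimal sketch. It is not the Flex program: the record layout (temperature, bond number, dihedral value), the function names, and the demonstration data are assumptions made for illustration; only the 36 ten-degree buckets and the normalization come from the text.

```python
from collections import defaultdict

N_BINS = 36        # 36 ten-degree buckets covering the dihedral circle
BIN_WIDTH = 10.0   # degrees


def bin_index(angle_deg):
    """Map a dihedral angle in degrees to a bucket index 0..35."""
    wrapped = (angle_deg + 180.0) % 360.0   # shift into [0, 360)
    return int(wrapped // BIN_WIDTH)


def build_memories(records):
    """Build normalized populations from (temperature, bond, dihedral) records.

    Returns {bond: {temperature: [36 normalized populations]}}.
    """
    counts = defaultdict(lambda: defaultdict(lambda: [0] * N_BINS))
    for temperature, bond, dihedral in records:
        counts[bond][temperature][bin_index(dihedral)] += 1

    memories = {}
    for bond, by_temperature in counts.items():
        memories[bond] = {}
        for temperature, row in by_temperature.items():
            total = sum(row)
            memories[bond][temperature] = [c / total for c in row]
    return memories


# Hypothetical records: bond 4 observed three times at 310K.
demo = [(310, 4, -175.0), (310, 4, 62.0), (310, 4, 178.0)]
memory = build_memories(demo)
```

Each entry `memory[bond][temperature]` plays the role of one row of the 19x36 spreadsheet for that bond.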

The procedure for creating conformational memories
for the dihedral angles results in an enormous compression
of the large volume of data needed to describe a
35 dimensional hypertorsional space. The condensation
of the information in Deltagraph plots yields identifiable
structural motifs. For example, bond 4, shown in
Figure 6a, has a classic three state distribution:
trans, gauche+ and gauche-; bond 6, the phi angle of
residue 3 (Fig. 6b), has a continuous population
distribution over a very large range from about -60 to
180 degrees and no population in the other regions at
any temperature. In contrast, bond 7, the psi angle of
residue 3 (Fig. 6c), has a narrow all trans distribution.
The distribution of bond 35 (Fig. 6d) favors
a trans conformation, but maintains significant population
over the entire dihedral space at all temperatures.
The construction of conformational memories
has been interfaced with the Macromodel molecular
modeling package (Mohamadi et al., J. Comput. Chem.
11:440 (1990)) version 5.0 so that the continuum GB/SA
solvent model and the recently developed amino acid
backbone torsional potentials (McDonald and Still,
Tetra. Lett. 22:7743 (1992)) from the Macromodel
package could be used in the present conformational
study.

Convergence of Conformational Memories

By including the results from multiple
explorations of all possible combinations of dihedral
angle values for all rotatable bonds of the molecule,
the thirty-five conformational memories provide a complete
mapping of the conformational space of GnRH with
no approximations, as long as the calculated populations
are converged. In the original formulation of
the method, population convergence was identified as
the difficult and crucial aspect of forming conformational
memories (Wilson and Guarnieri, Tetrahedron
Lett., 22:3601 (1991)). Because the second phase of the
simulation, the biased sampling, explores only the
parts of the conformational space identified as populated
regions, correct identification of those regions is
essential: population convergence ensures that
regions that could be thermally accessible are not
erroneously labeled as being unpopulated.
Population convergence for GnRH was confirmed
in three different ways: by creating conformational
memory difference maps for simulations of different
length, by analyzing intrinsic symmetry, and by showing
that there is no significant difference in the populations
of actual structures of GnRH created from
Conformational Memories obtained from 25, 50, 75, 100
and 157 independent MC/SA runs. Figure 7 shows the
Conformational Memory difference maps for dihedral
angle 1, comparing simulation lengths of 10, 25, 50, 75
and 100 runs. The difference map in Figure 7a is
created by subtracting the Conformational Memory
obtained from a 25 run MC/SA simulation from that of a
10 run MC/SA simulation. In Figure 7b the difference is
between 50 and 25 runs, Figure 7c shows the difference
between 75 and 50 runs, and Figure 7d is the difference
map between 100 and 75 runs. The progression clearly
shows the convergence. The other dihedral angles have
very similar difference maps for this sequence of
comparisons.
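A difference map of the kind described above is an element-by-element subtraction of two Conformational Memories for the same bond. The sketch below illustrates this under stated assumptions: the small matrices are hypothetical, and the single-number score (largest absolute difference) is an illustrative summary that the original text does not use.

```python
def difference_map(memory_a, memory_b):
    """Subtract memory_b from memory_a, entry by entry.

    Each memory is a list of rows (one per temperature); each row holds
    the normalized bucket populations for one bond.
    """
    return [[a - b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(memory_a, memory_b)]


def max_abs_difference(diff):
    """Largest absolute entry of a difference map."""
    return max(abs(x) for row in diff for x in row)


# Hypothetical memories with two temperatures and three buckets.
mem_10_runs = [[0.50, 0.30, 0.20], [0.70, 0.20, 0.10]]
mem_25_runs = [[0.45, 0.35, 0.20], [0.68, 0.22, 0.10]]

diff = difference_map(mem_10_runs, mem_25_runs)
score = max_abs_difference(diff)
```

If the populations are converging, a score of this kind computed for successive pairs of run counts would be expected to shrink.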

A second measure of convergence is symmetry.
Because dihedral angle 17 has a 2-fold axis of symmetry,
it is expected that the dihedral space of this
bond will have symmetric population distributions
centered at -90 and 90 degrees. A temperature slice at
310K of this dihedral for 25, 50, 75 and 100 run MC/SA
simulations is shown in Figure 8. The population
distributions clearly conform to the symmetry
considerations.

The third indication of convergence is the finding
(see below) that biased sampling from Conformational
Memories created from 25, 50, 75, 100 and 157 MC/SA
runs yields very similar profiles of conformational
families.

Biased Sampling From Conformational Memories:
Elimination of Barriers

Once the conformational memories are established,
a new Monte Carlo search is performed at 310K, sampling
only from the populated regions. Because only about 50% of
the torsional space of the 35 bonds is populated at
310K, the conformation space that needs to be
explored in the biased sampling phase of the simulation
has been reduced, without approximations, by many orders
of magnitude. Table 3 is an excerpt of the probability
matrix for GnRH at 310K. The dimension of this probability
matrix is 35x36, for the 35 rotatable bonds
partitioned into 36 buckets over the 360 degree
dihedral space (note that only 11 of the 36 dihedral
buckets and only 16 of the 35 rotatable torsional
angles are shown in Table 3). The first line indicates
that at 310K bond 1 is found in the -180 to -170
dihedral interval 10.1% of the time. In contrast, the
seventh column of the first row indicates that bond 1
is never found in the dihedral interval -120 to -110 at
310K.
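The size of this reduction can be estimated with a one-line calculation. The sketch assumes, for illustration only, that every one of the 35 bonds independently retains the same populated fraction of about 50%; the text gives only the overall figure, so the per-bond uniformity is an assumption.

```python
n_bonds = 35
populated_fraction = 0.5   # "about 50%", applied to every bond (assumption)

# Fraction of the full 35-dimensional torsional space that remains
# when each bond keeps only its populated half.
remaining = populated_fraction ** n_bonds
print(f"{remaining:.1e}")
```

Under this assumption the remaining volume is roughly 3 parts in 10^11 of the full space, that is, a reduction of more than ten orders of magnitude.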

The two stage process of developing Conformational
Memories and then performing the biased sampling from
these distributions is necessary in order to sample the
entire conformation space of the molecule. An
obviously simpler alternative would be to limit the
conformational exploration to standard Metropolis Monte
Carlo at 310K and monitor the development of the random
walk over torsional space. However, this simulation
constitutes the last step in the development of the
Conformational Memories for the temperature of 310K; it
is clearly inadequate, as indicated by the acceptance
rate. The acceptance rate is about 28% at 2070K, with
a step size chosen randomly within the interval of +/-180
degrees and rotating two dihedrals selected
randomly at each step. At 310K, using the same parameters,
the acceptance rate falls below 2%. Therefore,
the sampling of the 35 dimensional dihedral space would
be incomplete if these parameters were used for the
Monte Carlo random walk procedure at 310K. Even if the
random interval from which trial configurations are
sampled were reduced to +/-30 degrees (to increase the
acceptance rate), sampling would still be insufficient
because the majority of new conformations would be in
the local area of the previous conformation. The +/-180
degree step size was deliberately chosen so that
new conformations can be created by jumping between
wells without having to climb over barriers. A single
simulated annealing run cannot be expected to cover
such a vast space, but the accumulation of multiple runs,
each of which performs a different random walk,
can be shown to converge, as illustrated in Figures 7
and 8.

The restriction of the sampling to the populated
regions identified in the previous step (i.e., the
Conformational Memories) is achieved by partitioning
the 0-1 interval of the random number generator into
the 36 parts which correspond to the 36 separate 10-degree
intervals for each rotatable dihedral angle.
The partitioning of the random number interval is
proportional to the population of each 10-degree bucket.
New biased trial conformations are generated by
randomly choosing two rotatable bonds, generating a new
random number for each bond, determining to which of
the 36 intervals each new random number for each bond
belongs, and driving the bonds to the appropriate
intervals, with the exact dihedral value within each
interval determined by a linear interpolation. This
procedure is illustrated in Figure 9.
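The partitioning of the 0-1 interval described above can be sketched as follows. This is an illustrative reading of the procedure, not the original implementation: the function names, the convention that bucket 0 starts at -180 degrees, and the dictionary layout of the memory rows are assumptions.

```python
import random

BIN_WIDTH = 10.0   # degrees per bucket; bucket 0 is assumed to start at -180


def sample_dihedral(populations, u):
    """Map a random number u in [0, 1) to a dihedral angle in degrees.

    The 0-1 interval is divided into segments whose lengths are
    proportional to the bucket populations. The position of u inside its
    segment is mapped linearly onto that bucket's ten-degree range.
    """
    total = sum(populations)
    if total <= 0:
        raise ValueError("memory row has no populated bucket")
    lower = 0.0
    chosen = None
    for k, p in enumerate(populations):
        if p <= 0:
            continue                  # unpopulated buckets get no segment
        width = p / total
        chosen = (k, lower, width)
        if u < lower + width:
            break
        lower += width
    k, lower, width = chosen
    frac = min(max((u - lower) / width, 0.0), 1.0)   # linear interpolation
    return -180.0 + BIN_WIDTH * (k + frac)


def biased_trial(memory_rows, rng=random):
    """Choose two bonds at random and draw a new dihedral for each.

    memory_rows: {bond: [36 populations at the sampling temperature]}.
    """
    bonds = rng.sample(sorted(memory_rows), 2)
    return {b: sample_dihedral(memory_rows[b], rng.random()) for b in bonds}
```

For example, with a row in which only bucket 0 (population 0.25) and bucket 18 (population 0.75) are populated, `sample_dihedral(row, 0.125)` lands in the middle of the first bucket and `sample_dihedral(row, 0.625)` in the middle of the second; no random number can select an unpopulated bucket.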

A major advantage of the Conformational Memory
biased sampling method is that partitioning the random
number generator among the populated intervals results
in a sampling technique that eliminates the barrier
crossing problem. During the biased sampling random
walk, a new trial configuration is sampled from the
Conformational Memory, which can be any part of the
populated dihedral space, and then the new conformation
is created by driving the current structure to the
appropriate configuration. Hence, the problem of a
barrier restricting access to any part of the conformational
space is eliminated in this procedure. Because
Conformational Memories are mean field distributions,
the correlations among the different
flexible torsional angles have been lost in the
averaging process. Nevertheless, the Conformational
Memory biased sampling technique does preferentially
bring together the higher probability regions of the
different dihedrals. Thus, the method introduces
average correlations among the different dihedral
angles during the selection process, while accessing
all populated regions. It is important to note that
the original formulation of the Conformational Memories
biased sampling technique (Guarnieri and Wilson, J.
Comput. Chem. 16:648-653 (1995)) violates detailed
balance. Here, we have corrected the biased sampling
so that it obeys detailed balance by multiplying the
Boltzmann function used in the Metropolis test with the
factor P1old*P2old/(P1new*P2new), where P1old and P2old
are the population percentages of the ten degree intervals
of the Conformational Memories of the two dihedral
angles in the current conformation of the random walk
(because in this example two dihedrals per step are
changed). P1new and P2new are the corresponding
population percentages of the new dihedral values for
these angles in the new trial conformation.
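The corrected acceptance test can be written compactly. The sketch below is an assumption-laden illustration rather than the original code: energies are taken to be in kJ/mol, the Boltzmann function is taken as exp(-(E_new - E_old)/RT), and the function and parameter names are hypothetical. The calculation is done with logarithms so that a strongly downhill move does not overflow.

```python
import math
import random

R_KJ_PER_MOL_K = 0.0083145   # gas constant in kJ/(mol K)


def acceptance_probability(e_old, e_new, temperature,
                           p1_old, p2_old, p1_new, p2_new):
    """Metropolis probability multiplied by P1old*P2old/(P1new*P2new).

    e_old, e_new: energies in kJ/mol (assumed units).
    p*_old, p*_new: memory populations of the buckets that hold the two
    changed dihedrals before and after the trial move (all must be > 0).
    """
    log_ratio = -(e_new - e_old) / (R_KJ_PER_MOL_K * temperature)
    log_ratio += math.log(p1_old * p2_old) - math.log(p1_new * p2_new)
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)


def accept(probability, rng=random):
    """Accept the trial move with the given probability."""
    return rng.random() < probability
```

The factor compensates for the fact that highly populated buckets are proposed more often: a move into buckets that are four times as likely to be proposed as the current ones is, at equal energy, accepted only a quarter of the time.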

p^loome ^ - of Conformational Families 

We performed several sequences of biased sampling
runs at 310K to determine the best and simplest way to
create representative conformational families for the
GnRH peptide. The first run was a 10,000 step MC
random walk using the Conformational Memory biased
sampling technique with uniform sampling of 100 structures
(1 sample every 100 steps). The second run was a
50,000 step MC random walk using the Conformational
Memory biased sampling technique with uniform sampling
of 100 structures (1 sample every 500 steps). The
third and fourth runs were 100,000 and 500,000 step
biased sampling runs, also sampling 100 structures in
the same manner. Each batch of 100 structures was
analyzed with the program XCluster (Shenkin and
McDonald, J. Comput. Chem. 15:899-916 (1994)). XCluster
inputs the series of 100 conformations and computes the
RMS difference between all possible pairs of conformations.
Structures 2-100 of the input sequence are then
reordered based on increasing RMS deviation. In the
new ordering, considering all 100 conformations, conformer
2 has the smallest RMS deviation from conformer
1, and conformer 3 has the smallest RMS deviation from
conformer 2, etc. XCluster then produces a graphical

representation of the RMS deviations between every pair
of conformations. Since the conformations have been
rearranged so that the RMS deviation between nearest
neighbors is minimal, a large jump in RMS deviation
indicates a major
structural change and hence identifies a new conformational
family. As described below, we settled on
500,000 steps for the subsequent biased sampling runs.
We then performed these biased sampling runs using
Conformational Memories created from 25, 50, 75, 100
and 157 run MC/SA simulations.
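The reordering idea can be illustrated with a greedy nearest-neighbor chain. This is a simplified sketch of the description given here and should not be taken as XCluster's actual algorithm; the RMS matrix, the threshold value, and the function names are hypothetical.

```python
def greedy_order(dist):
    """Order conformers so each is followed by its nearest unplaced neighbor.

    dist is a symmetric matrix of pairwise RMS deviations; conformer 0
    is kept in first position.
    """
    order = [0]
    remaining = set(range(1, len(dist)))
    while remaining:
        last = order[-1]
        nxt = min(remaining, key=lambda j: (dist[last][j], j))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def family_breaks(dist, order, threshold):
    """Positions in the ordering where the neighbor RMS exceeds threshold."""
    return [i for i in range(1, len(order))
            if dist[order[i - 1]][order[i]] > threshold]


# Hypothetical RMS matrix: conformers {0, 2} are close, as are {1, 3}.
rms = [[0.0, 3.0, 0.4, 3.2],
       [3.0, 0.0, 3.1, 0.5],
       [0.4, 3.1, 0.0, 3.3],
       [3.2, 0.5, 3.3, 0.0]]

order = greedy_order(rms)
breaks = family_breaks(rms, order, threshold=2.0)
```

In this toy case the ordering places the two similar pairs next to each other, and the single large jump between them marks the boundary between two families.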

III - Results And Discussion
Conformational Families of GnRH

The 500,000 step biased sampling runs for GnRH
with a sampling rate of 1 structure every 5,000 steps
require 4.3 hours per run on a 200 MHz SGI Challenge
workstation. Structures from the 500,000 step biased
sampling runs were grouped into conformational families
as described above. A backbone trace of representatives
of the very distinct backbone
conformations that emerged from this procedure is shown
in Figure 10. Notably, similar results were obtained
regardless of the origin of the Conformational Memories
from MC/SA simulations of 25, 50, 75, 100 or 157 runs.
Families of conformations having a beta-turn
occur with a frequency of approximately
70%. A distribution showing a superimposition of 70 of
these structures is illustrated in Figure 11 (GnRH is
colored in red, with Arg8 colored in green). The
beta-type turn common to all the structures in this family
is clearly evident (Fig. 11). In contrast, families
which have an extended backbone occur with a frequency
of about 5%. The distribution of side chain orientations
of Arg8 in all conformational families is wider
than that of any other residue in GnRH. The results
of Struthers et al. (Proteins: Structure, Function and



WO 96/34347 



-26- 



PCT/US96/06110 



Genetics 8:295-304 (1990)) from the examination of
different GnRH analogs seem to indicate that an
arginine is required as part of the pharmacophore. The
present results, on the other hand, may indicate that
the role of Arg8 in the receptor interaction of GnRH
could relate to the backbone conformation, rather than
to its participation in a recognition pharmacophore.

It is noteworthy that biased sampling runs of
10,000-25,000 steps resulted in large (unconverged)
fluctuations in the ratio of beta-turn to extended
backbone conformations. However, more extended biased
sampling runs of 100,000-500,000 steps resulted in
negligible fluctuations in this ratio. Although it
appeared from our
calibration studies that 100,000 step biased sampling
runs are sufficient, we chose to carry out the more
extensive 500,000 step biased sampling runs for all the
calculations presented here.



Conformational Families of Lys8-GnRH

The Lys8 analog of GnRH had been constructed to
explore the role of Arg8 in molecular recognition of
GnRH by its receptor (Karten and Rivier, Endo. Rev.
2:44-66 (1986); Millar et al., J. Biolog. Chem.
264:21007-21013 (1989)). Mutation studies of GnRH
receptors from various species have implicated Arg8 as
being important for mammalian hormone-receptor recognition
(Flanagan et al., J. Biolog. Chem. 269:22636-22641
(1994)). To analyze the structural implications of
Arg8 for the activity of GnRH, we compared the
conformational profile of the peptide hormone with that
of the mutant Lys8-GnRH, which is known to be a low
affinity GnRH agonist. In contrast to the wild type
hormone, the major conformational family of the Lys8-GnRH
congener was found to have an extended backbone,
while the beta-turn conformation exists as a very minor
family. A backbone trace of a representative of each
family is shown in Figure 12. The family of conformations
represented in Figure 12a has an extended
backbone and occurs with a frequency of greater than
70%. The Lys8-GnRH family that has a beta-type turn
in the backbone (Figure 12b), which is
virtually identical to the major conformational family
of GnRH (Figure 10a), has a probability of only
about 3%. A distribution of the members of the
predominant Lys8-GnRH family superimposed upon each
other is shown in Figure 13, with the entire molecule
shown in red, except for Lys8, which is colored green.

Since Lys8-GnRH has a low affinity for the GnRH
receptor, but elicits the same response once it
interacts with the receptor, it is tempting to suggest
that adoption of a large population of beta-type turn
conformation is a key requirement for hormone-receptor
recognition. This inference agrees with earlier proposals
in the literature, and is supported by results
from additional Conformational Memories simulations on
the structural characterization of eight other GnRH
analogs that exhibit different distributions between
the beta-turn like structures and the fully extended
conformations of the backbone (Guarnieri et al.,
unpublished results). It is particularly noteworthy
that our simulations lead to the same conclusions
regarding the importance of the bent structure that
Struthers et al. drew from their combined NMR and molecular
dynamics studies of conformationally constrained GnRH
analogs (Struthers et al., Proteins: Structure,
Function and Genetics 8:295-304 (1990)).

Structural Comparison To a Constrained GnRH Analog 
To test the key inference from the present
simulations of GnRH analogs, regarding the correlation
between the population of beta-type turn structure and






affinity for receptor, we compared several samples from
the most populated conformational family of GnRH
obtained from Conformational Memories to a structurally
constrained cyclic decapeptide GnRH analog (Baniak et
al., Biochem. 26:2642-2656 (1987)). The conformation of
this cyclic decapeptide was determined from NOE data
using 2D NMR techniques (Baniak et al., Biochem.
26:2642-2656 (1987)). These experimental studies concluded
that residues 6 and 7 formed a type II beta-turn
and residues 1 and 2 formed a type II beta-turn.
Additionally, it was concluded that a weak hydrogen bond
existed between the Arg8-NH and the Tyr5-CO, and a
stronger hydrogen bond between the D-Trp3-NH and the
beta-Ala10-CO. To allow for the comparison, a structure
of this GnRH analog was built in Macromodel 4.5
according to the specifications (Baniak et al., Biochem.
26:2642-2656 (1987)), using the beta-turn definitions
of Hutchinson and Thornton (Hutchinson and
Thornton, Protein Science 3:2207-2216 (1994)). This
reconstructed GnRH analog was compared to the GnRH
structures obtained from the Conformational Memories
described above.

Several of the members of the major conformational
family of GnRH obtained from the Conformational
Memories were selected at random and superimposed on
the reconstructed geometry of the analog, using the 11
backbone atoms from the Tyr5-CO to the -N of Pro9.
All computationally derived structures superimposed on
the reconstructed structure with RMS deviation in a
range of 0.6-0.8 Å. An illustration of the superimposition
is shown in Figure 14. Clearly, the
computationally derived structure is closely related to
the reconstructed backbone of residues 5-8 of the
experimentally derived peptide structure. The structures
diverge toward the N-terminus,
and a superimposition of all atoms results in
a 5 Å RMS deviation.

GnRH Conformations From a Buildup Procedure
Recently, Nikiforovich and Marshall (Int. J.
Peptide Protein Res. 42:171-180 (1993)) constructed low
energy conformations of GnRH using the ECEPP program
(Dunfield et al., J. Phys. Chem. 82:2609-2616 (1978)).
We have reconstructed eight conformations from the
published list of backbone dihedral angles and a list of
side chain dihedrals graciously provided by the authors
(Nikiforovich and Marshall, Int. J. Peptide Protein
Res. 42:171-180 (1993)). The energies of these
reconstructed peptide structures were compared with
representatives from the three major families of GnRH
found using Conformational Memories. The optimal
geometries of GnRH obtained from the two
methods were quite different, and the energies of the
eight conformations calculated from ECEPP were 300-400
kJ/mol (20-25%) higher than those calculated from the
conformations generated using Conformational Memories.
It is unlikely that this large difference can be attributed
solely to the use of different force fields in
the definition of optimal conformations, since a recent
comparative study resulted in very similar low energy
Met-Enkephalin structures (Montcalm et al., J. of Mol.
Str. (Theochem) 308:37-51 (1994)). However, a major
source of difference may be the use of a GB/SA water
model in the Conformational Memories approach, and
perhaps a more complete exploration of the conformational
space.

Exploration of the Unpopulated Regions 

As a stringent test of the completeness of the
conformational exploration, we performed extensive
sampling from the unpopulated regions of the
Conformational Memories for several key dihedral angles
involved in the formation of the beta-turn of GnRH.
With one exception, this sampling produced high energy
structures in all cases, as expected. The one
interesting exception occurred during the sampling of
the unpopulated regions of the phi angle of Gly6. This
sampling produced a structure only 20 kJ/mol higher in
energy than the best GnRH structure. The dihedral
value came from a bin that had a 0.6% population at
345K, but had a 0% population at 310K and therefore was
not included in the populated portion from which the MC
biased sampling was done. A simple way to avoid
missing a very low probability low energy structure
when performing the biased sampling at 310K is to use
the probability weights from a higher temperature. Our
exploration of the unpopulated regions of the phi angle
of Gly6 at temperatures 100-200K above 310K eliminated
this problem. The small drawback, however, is that 44%
of the dihedral space of Gly6 is unpopulated at 310K
and only 33% is unpopulated at 473K. Thus, a safety
factor during the biased sampling run involves
exploring about 10% more dihedral space per rotatable
torsion, but ensures the inclusion of all populated
areas. Conformational regions that exhibit 0%
population in the calculation of the isolated peptide
in water at 310K may still be of biological importance,
if some of these conformations can be induced by the
interaction energies of the peptide with the receptor.
The finding that regions unpopulated at 310K are in
fact populated at temperatures higher by only 100K
(corresponding to an energy difference of only a
fraction of a kcal/mol) indicates the feasibility of
such "receptor-induced" conformations.
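The "fraction of a kcal/mol" figure can be checked with a short calculation. The sketch assumes that the energy difference meant here is the change in thermal energy RT for a 100K rise in temperature; that interpretation, and the names used, are assumptions and are not spelled out in the text.

```python
R_KCAL_PER_MOL_K = 0.0019872   # gas constant in kcal/(mol K)


def thermal_energy(temperature_k):
    """Thermal energy RT in kcal/mol at the given temperature."""
    return R_KCAL_PER_MOL_K * temperature_k


# Change in RT between 310K and a temperature 100K higher.
delta = thermal_energy(410.0) - thermal_energy(310.0)
print(round(delta, 3))
```

Under this reading the difference is about 0.2 kcal/mol, which is small compared with typical peptide-receptor interaction energies.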



Conclusions
As applied to the decapeptide hormone GnRH, the
method of Conformational Memories was shown to provide
a powerful practical solution to the complex problem
presented by the flexibility of polypeptides with a
large number of rotatable bonds. The method
was shown to be capable of achieving complete sampling
of the conformational space, to converge in a
practical number of steps, and to be capable of
overcoming energy barriers efficiently.
The results suggest a
relation between the beta-turn structure identified as
the major conformational family of GnRH, and high
affinity for the GnRH receptor. While these inferences
were inherent in the results of earlier investigations
of conformationally restricted GnRH analogs, the
present study provides unbiased support for this
mechanistic hypothesis based on a complete exploration
of the conformational space of the peptide hormone
itself and its unconstrained congeners. Because the
method seems to have produced the lowest energy
conformers reported for GnRH from a full exploration
that is economical and practical, its general application
to the study of peptide structure-function
relations should continue to produce important
mechanistic insights and powerful guides for ligand design.
TABLE 1

A sample of the output collected in the history
files of the simulated annealing random walks. Column
1 indicates whether the data are produced from an accepted
or rejected step, with 0=rejected and 1=accepted. The
second column lists the pair of atom numbers identifying
the dihedral angles that were rotated to produce the
trial structure. The third column lists the extent to
which the dihedral was rotated in order to create the
trial structure. The fourth column lists the energy of
the current conformation (the energy of the original
structure if rejected or the new structure if
accepted). The fifth column lists the current dihedral
values of the conformation (the dihedral angle of the
original structure if rejected or the new structure if
accepted).
TABLE 2

A sample of a conformational memory spreadsheet.
The first row labels the dihedral circle across the
y-axis. The first column labels the temperatures across
the x-axis. Each cell contains the population
corresponding to a given temperature and a given 10 degree
dihedral bucket, which is plotted on the z-axis. Note
that the columns of the spreadsheet are cut off after
-40 degrees.
TABLE 3

Excerpt from the population probability matrix for
GnRH at 310K. The dimensions of this matrix are 35x36
(35 rotatable dihedral angles, with the population
distribution of each angle broken into 36 intervals of
10 degrees). Note that only 14 rows and 11 columns of
this matrix are shown in the Table.

Various publications are cited herein, the contents
of which are hereby incorporated by reference in
their entireties.

CLAIMS

1. A method for identifying a structurally active
molecule comprising

(a) performing multiple simulated annealing runs
in order to reveal populated and unpopulated regions of
multidimensional conformation space, and

(b) performing a simulation at a fixed temperature,
with sampling only from populated regions found
in the first step.